<!DOCTYPE HTML>
<html lang="en-US" >
    
    <head>
        
        <meta charset="UTF-8">
        <title>Analysis | Elasticsearch: The Definitive Guide</title>
        <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
        <meta name="description" content="">
        <meta name="generator" content="GitBook 1.0.3">
        <meta name="HandheldFriendly" content="true"/>
        <meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no">
        <meta name="apple-mobile-web-app-capable" content="yes">
        <meta name="apple-mobile-web-app-status-bar-style" content="black">
        <link rel="apple-touch-icon-precomposed" sizes="152x152" href="../gitbook/images/apple-touch-icon-precomposed-152.png">
        <link rel="shortcut icon" href="../gitbook/images/favicon.ico" type="image/x-icon">
        
    
    
    
    <link rel="next" href="../mapping_analysis/mapping.html" />
    
    
    <link rel="prev" href="../mapping_analysis/inverted_index.html" />
    

        
    </head>
    <body>
        
        
<link rel="stylesheet" href="../gitbook/style.css">


        
    <div class="book"  data-level="6.3" data-basepath=".." data-revision="1436390985808">
    


    <div class="book-body">
        <div class="body-inner">
            <div class="book-header">

    <!-- Title -->
    <h1>
        <i class="fa fa-circle-o-notch fa-spin"></i>
        <a href="../" >Elasticsearch: The Definitive Guide</a>
    </h1>
</div>

            <div class="page-wrapper" tabindex="-1">
                <div class="page-inner">
                
                
                    <section class="normal" id="section-gitbook_902">
                    
                        <h3 id="analysis-intro">Analysis and analyzers</h3>
<p><em>Analysis</em> is the process of:</p>
<ul>
<li>first, tokenizing a block of text into
individual <em>terms</em> suitable for use in an inverted index,</li>
<li>then normalizing these terms into a standard form to improve their
&quot;searchability&quot; or <em>recall</em>.</li>
</ul>
<p>This job is performed by <em>analyzers</em>. An <em>analyzer</em> is really just a wrapper
which combines three functions into a single package:</p>
<p><strong>Character filters</strong></p>
<p>First, the string is passed through any <em>character filters</em> in turn. Their
job is to tidy up the string before tokenization. A character filter could
be used to strip out HTML, or to convert <code>&quot;&amp;&quot;</code> characters to the word
<code>&quot;and&quot;</code>.</p>
<p><strong>Tokenizer</strong></p>
<p>Next, the string is tokenized into individual terms by a <em>tokenizer</em>. A
simple tokenizer might split the text up into terms whenever it encounters
whitespace or punctuation.</p>
<p><strong>Token filters</strong></p>
<p>Last, each term is passed through any <em>token filters</em> in turn, which can
change terms (e.g. lowercasing <code>&quot;Quick&quot;</code>), remove terms (e.g. stopwords like
<code>&quot;a&quot;</code>, <code>&quot;and&quot;</code>, <code>&quot;the&quot;</code>), or add terms (e.g. synonyms like <code>&quot;jump&quot;</code> and
<code>&quot;leap&quot;</code>).</p>
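<p>The three stages can be sketched in a few lines of Python. This is only an illustration of the pipeline idea, not Elasticsearch code: the function names, the regular expression used as a tokenizer, and the three-word stopword list are all assumptions made for the example.</p>

```python
import re

# Illustrative stopword list (an assumption, not Elasticsearch's own list).
STOPWORDS = {"a", "and", "the"}

def ampersand_to_and(text):
    # Character filter: tidies the raw string before tokenization.
    return text.replace("&", " and ")

def tokenize(text):
    # Tokenizer: keeps runs of ASCII letters and digits, so whitespace
    # and punctuation act as separators.
    return re.findall(r"[A-Za-z0-9]+", text)

def lowercase(tokens):
    # Token filter that changes terms.
    return [t.lower() for t in tokens]

def remove_stopwords(tokens):
    # Token filter that removes terms.
    return [t for t in tokens if t not in STOPWORDS]

def analyze(text, char_filters, tokenizer, token_filters):
    # An analyzer is a wrapper: character filters, then one tokenizer,
    # then token filters, each applied in order.
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens

terms = analyze(
    "The Quick & Brown fox",
    char_filters=[ampersand_to_and],
    tokenizer=tokenize,
    token_filters=[lowercase, remove_stopwords],
)
print(terms)  # ['quick', 'brown', 'fox']
```

<p>Order matters in this sketch: the stopword filter runs after lowercasing, so <code>&quot;The&quot;</code> is removed along with the <code>&quot;and&quot;</code> that the character filter introduced.</p>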
<p>Elasticsearch provides many character filters, tokenizers and token filters
out of the box. These can be combined to create custom analyzers suitable
for different purposes. We will discuss these in detail in the section on custom analyzers.</p>
<h4>Built-in analyzers</h4>
<p>However, Elasticsearch also ships with a number of pre-packaged analyzers that
you can use directly. We list the most important ones below and, to demonstrate
the difference in behaviour, we show what terms each would produce
from this string:</p>
<pre><code>&quot;Set the shape to semi-transparent by calling set_trans(5)&quot;
</code></pre>
<p><strong>Standard analyzer</strong></p>
<p>The standard analyzer is the default analyzer that Elasticsearch uses. It is
the best general choice for analyzing text which may be in any language. It
splits the text on <em>word boundaries</em>, as defined by the
<a href="http://www.unicode.org/reports/tr29/" target="_blank">Unicode Consortium</a>, and removes most
punctuation. Finally, it lowercases all terms. It would produce:</p>
<pre><code>set, the, shape, to, semi, transparent, by, calling, set_trans, 5
</code></pre>
<p><strong>Simple analyzer</strong></p>
<p>The simple analyzer splits the text on anything that isn&#39;t a letter,
and lowercases the terms. It would produce:</p>
<pre><code>set, the, shape, to, semi, transparent, by, calling, set, trans
</code></pre>
<p><strong>Whitespace analyzer</strong></p>
<p>The whitespace analyzer splits the text on whitespace. It doesn&#39;t
lowercase. It would produce:</p>
<pre><code>Set, the, shape, to, semi-transparent, by, calling, set_trans(5)
</code></pre>
<p>Language analyzers:</p>
<p>Language-specific analyzers are available for many languages. They are able to
take the peculiarities of the specified language into account. For instance,
the <code>english</code> analyzer comes with a set of English stopwords (common words
like <code>and</code> or <code>the</code> which don&#39;t have much impact on relevance), which it
removes, and it is able to <em>stem</em> English words because it understands the
rules of English grammar.</p>
<p>The <code>english</code> analyzer would produce the following:</p>
<pre><code>set, shape, semi, transpar, call, set_tran, 5
</code></pre>
<p>Note how <code>&quot;transparent&quot;</code>, <code>&quot;calling&quot;</code>, and <code>&quot;set_trans&quot;</code> have been stemmed to
their root form.</p>
<h4>When analyzers are used</h4>
<p>When we <em>index</em> a document, its full text fields are analyzed into terms which
are used to create the inverted index.  However, when we <em>search</em> on a full
text field,  we need to pass the query string through the <em>same analysis
process</em>, to ensure that we are searching for terms in the same form as those
that exist in the index.</p>
<p>Full text queries, which we will discuss later, understand how each field is
defined, and so they can do the right thing:</p>
<ul>
<li><p>When you query a <em>full text</em> field, the query will apply the same analyzer
to the query string to produce the correct list of terms to search for.</p>
</li>
<li><p>When you query an <em>exact value</em> field, the query will not analyze the
query string, but instead search for the exact value that you have
specified.</p>
</li>
</ul>
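<p>The need to analyze the query string the same way as the indexed text can be
illustrated with a toy inverted index in Python. This is a rough sketch, not
Elasticsearch internals: the stand-in <code>analyze</code> function, the two sample
documents and the helper names are all assumptions made for the example.</p>
<pre><code class="lang-python">import re
from collections import defaultdict


def analyze(text):
    # Stand-in analyzer: split on non-alphanumeric characters and lowercase.
    return [term.lower() for term in re.findall(r"[^\W_]+", text)]


def build_index(docs):
    # Map each analyzed term to the set of document IDs that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search_analyzed(index, query):
    # Full text behaviour: run the query string through the same analyzer.
    hits = set()
    for term in analyze(query):
        hits |= index.get(term, set())
    return hits


def search_raw(index, query):
    # No analysis: look the query string up exactly as typed.
    return index.get(query, set())


index = build_index({1: "The Quick brown fox", 2: "A lazy dog"})
print(search_analyzed(index, "QUICK"))  # {1}
print(search_raw(index, "QUICK"))       # set()
</code></pre>
<p>The raw lookup finds nothing because the index holds the term <code>quick</code>, not
<code>QUICK</code>; only the analyzed query searches for terms in the form in which they
were indexed.</p>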
<p>Now you can understand why the queries that we demonstrated at the
start of this chapter return what they do:</p>
<ul>
<li>The <code>date</code> field contains an exact value: the single term <code>&quot;2014-09-15&quot;</code>.</li>
<li>The <code>_all</code> field is a full text field, so the analysis process has
converted the date into the three terms: <code>&quot;2014&quot;</code>, <code>&quot;09&quot;</code> and <code>&quot;15&quot;</code>.</li>
</ul>
<p>When we query the <code>_all</code> field for <code>2014</code>, it matches all 12 tweets, because
all of them contain the term <code>2014</code>:</p>
<pre><code>GET /_search?q=2014              # 12 results
</code></pre>
<p>When we query the <code>_all</code> field for <code>2014-09-15</code>, it first analyzes the query
string to produce a query which matches <em>any</em> of the terms <code>2014</code>, <code>09</code> or
<code>15</code>. This also matches all 12 tweets, because all of them contain the term
<code>2014</code>:</p>
<pre><code>GET /_search?q=2014-09-15        # 12 results !
</code></pre>
<p>When we query the <code>date</code> field for <code>2014-09-15</code>, it looks for that <em>exact</em>
date, and finds one tweet only:</p>
<pre><code>GET /_search?q=date:2014-09-15   # 1  result
</code></pre>
<p>When we query the <code>date</code> field for <code>2014</code>, it finds no documents
because none contain that exact date:</p>
<pre><code>GET /_search?q=date:2014         # 0  results !
</code></pre>
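<p>The four results above can be reproduced with a small Python model. It is a
sketch only: it uses three invented documents rather than the 12 tweets, treats
the <code>date</code> field as a single exact term as described above, and
approximates the analysis of <code>_all</code> with a simple regular expression.</p>
<pre><code class="lang-python">import re
from collections import defaultdict

# Hypothetical sample data: document ID mapped to its date value.
docs = {1: "2014-09-15", 2: "2014-09-16", 3: "2014-10-01"}


def analyze(text):
    # Rough stand-in for full text analysis: "2014-09-15" gives 2014, 09, 15.
    return re.findall(r"[^\W_]+", text.lower())


date_index = defaultdict(set)  # exact value: one term per date
all_index = defaultdict(set)   # full text: analyzed terms

for doc_id, date in docs.items():
    date_index[date].add(doc_id)
    for term in analyze(date):
        all_index[term].add(doc_id)


def query_all(query):
    # Analyze the query string and match ANY of the resulting terms.
    hits = set()
    for term in analyze(query):
        hits |= all_index.get(term, set())
    return hits


def query_date(query):
    # Exact value field: no analysis, look up the value as given.
    return date_index.get(query, set())


print(len(query_all("2014")))         # 3: every document contains 2014
print(len(query_all("2014-09-15")))   # 3: still matches on 2014
print(len(query_date("2014-09-15")))  # 1: exact match only
print(len(query_date("2014")))        # 0: no document has that exact value
</code></pre>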
<h4 id="analyze-api">Testing analyzers</h4>
<p>Especially when you are new to Elasticsearch, it is sometimes difficult to
understand what is actually being tokenized and stored into your index.  To
better understand what is going on, you can use the <code>analyze</code> API to see how
text is analyzed. Specify which analyzer to use in the query string
parameters,  and the text to analyze in the body:</p>
<pre><code>GET /_analyze?analyzer=standard
Text to analyze
</code></pre>
<p>Each element in the result represents a single term:</p>
<pre><code>{
   &quot;tokens&quot;: [
      {
         &quot;token&quot;:        &quot;text&quot;,
         &quot;start_offset&quot;: 0,
         &quot;end_offset&quot;:   4,
         &quot;type&quot;:         &quot;&lt;ALPHANUM&gt;&quot;,
         &quot;position&quot;:     1
      },
      {
         &quot;token&quot;:        &quot;to&quot;,
         &quot;start_offset&quot;: 5,
         &quot;end_offset&quot;:   7,
         &quot;type&quot;:         &quot;&lt;ALPHANUM&gt;&quot;,
         &quot;position&quot;:     2
      },
      {
         &quot;token&quot;:        &quot;analyze&quot;,
         &quot;start_offset&quot;: 8,
         &quot;end_offset&quot;:   15,
         &quot;type&quot;:         &quot;&lt;ALPHANUM&gt;&quot;,
         &quot;position&quot;:     3
      }
   ]
}
</code></pre>
<p>The <code>token</code> is the actual term that will be stored in the index. The
<code>position</code> indicates the order in which the terms appeared in the original
text. The <code>start_offset</code> and <code>end_offset</code> indicate the character positions
that the original word occupied in the original string.</p>
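<p>To see how these fields relate to the input string, the following Python
sketch computes the same token, offset and position values for
<code>Text to analyze</code>. It only imitates the shape of the response with a regular
expression; the function name is hypothetical, the <code>type</code> field is omitted,
and positions are numbered from 1 to match the output shown here.</p>
<pre><code class="lang-python">import re


def analyze_with_offsets(text):
    # Toy imitation of the analyze response: one entry per term.
    tokens = []
    matches = re.finditer(r"[^\W_]+", text)
    for position, match in enumerate(matches, start=1):
        tokens.append({
            "token": match.group().lower(),  # the term as it would be stored
            "start_offset": match.start(),   # first character in the original
            "end_offset": match.end(),       # one past the last character
            "position": position,            # order of the term in the text
        })
    return {"tokens": tokens}


result = analyze_with_offsets("Text to analyze")
for token in result["tokens"]:
    print(token)
</code></pre>
<p>Slicing the original string with a token&#39;s offsets, for example
<code>"Text to analyze"[8:15]</code>, gives back the word that produced the term.</p>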
<p>The <code>analyze</code> API is a really useful tool for understanding what is happening
inside Elasticsearch indices, and we will talk more about it as we progress.</p>
<h4>Specifying analyzers</h4>
<p>When Elasticsearch detects a new string field in your documents, it
automatically configures it as a full text <code>string</code> field and analyzes it with
the <code>standard</code> analyzer.</p>
<p>You don&#39;t always want this. Perhaps you want to apply a different analyzer
which suits the language your data is in. And sometimes you want a
string field to be just a string field: to index the exact value that
you pass in, without any analysis, such as a string user ID or an
internal status field or tag.</p>
<p>In order to achieve this, we have to configure these fields manually
by specifying the <em>mapping</em>.</p>
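<p>As a rough preview, such a mapping might look like the request body built
below. This is a hedged sketch that assumes the <code>string</code> field type and
<code>not_analyzed</code> setting of the Elasticsearch version this book covers; the
<code>tweet</code> type and the <code>tweet</code> and <code>tag</code> field names are hypothetical.</p>
<pre><code class="lang-python">import json

# Assumed mapping body: one full text field with a language analyzer,
# and one string field indexed as an exact value.
mapping = {
    "mappings": {
        "tweet": {
            "properties": {
                # Full text field analyzed with the english analyzer.
                "tweet": {"type": "string", "analyzer": "english"},
                # Exact value field: indexed as is, without analysis.
                "tag": {"type": "string", "index": "not_analyzed"},
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
</code></pre>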

                    
                    </section>
                
                
                </div>
            </div>
        </div>

        
        <a href="../mapping_analysis/inverted_index.html" class="navigation navigation-prev " aria-label="Previous page: Inverted_index"><i class="fa fa-angle-left"></i></a>
        
        
        <a href="../mapping_analysis/mapping.html" class="navigation navigation-next " aria-label="Next page: Mapping"><i class="fa fa-angle-right"></i></a>
        
    </div>
</div>

        
<script src="../gitbook/app.js"></script>

    
    <script src="https://cdn.mathjax.org/mathjax/2.4-latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
    

    
    <script src="../gitbook/plugins/gitbook-plugin-mathjax/plugin.js"></script>
    

<script>
require(["gitbook"], function(gitbook) {
    var config = {"fontSettings":{"theme":null,"family":"sans","size":2}};
    gitbook.start(config);
});
</script>

        
    </body>
    
</html>
